Testing and Quality Guide

The suite is based on unittest and covers sync/async execution, retries/timeouts, live startup retries, session metadata persistence, migration, locking, and observability hooks.

Run Unit Tests

python3 -m unittest discover -s tests -p "test_*.py"

Key files:

  • tests/test_codex_local_sdk.py
  • tests/test_client_sync_extended.py
  • tests/test_client_live_extended.py
  • tests/test_models_and_sessions_extended.py
  • tests/test_async_retry_and_session_store.py
  • tests/test_v2_hardening.py

Coverage Focus

Area | Scenarios | Outcome
Retry engine | Exit-code retries, jitter behavior, max retry window cutoff. | Backoff behavior is deterministic and bounded.
Timeout support | Timeout result generation, retry-on-timeout enabled/disabled. | Timeout path is explicit and policy-driven.
Live startup retries | Launch exception retry, immediate-exit retry, no retry post-handle return. | Startup resilience without post-stream side effects.
Session metadata | Record updates, bounded history truncation, named-session continuation. | Persistent records remain compact and useful.
Migration + locking | Legacy file auto-migration, concurrent writer safety checks. | JSON store remains valid and forward-compatible.
Telemetry hooks | Attempt/retry/live/session events, hook exception swallowing. | Observability is rich and non-disruptive.
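
To make the "deterministic and bounded" outcome for the retry engine concrete, here is a minimal sketch of a backoff schedule with jitter and a max retry window. The function name backoff_delays and all of its parameters are hypothetical and are not taken from the SDK; the point is only that injecting a seeded random source makes the jitter reproducible in a test.

```python
import random
from typing import Iterator, Optional


def backoff_delays(
    base: float = 0.5,
    factor: float = 2.0,
    max_delay: float = 8.0,
    max_window: float = 20.0,
    jitter: float = 0.1,
    rng: Optional[random.Random] = None,
) -> Iterator[float]:
    """Yield sleep durations until the running total would exceed max_window.

    Hypothetical helper for illustration; not the SDK's retry engine.
    """
    rng = rng or random.Random()
    elapsed = 0.0
    attempt = 0
    while True:
        # Exponential growth, capped per attempt.
        delay = min(base * factor ** attempt, max_delay)
        # Symmetric jitter: scale by a factor in [1 - jitter, 1 + jitter].
        delay *= 1.0 + rng.uniform(-jitter, jitter)
        if elapsed + delay > max_window:
            return  # window cutoff: no further retries
        elapsed += delay
        attempt += 1
        yield delay
```

A test can pass rng=random.Random(42) twice and compare the two sequences, or set jitter=0.0 to check the raw exponential schedule and the window cutoff.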

Real CLI Integration Tests

Integration tests run real codex exec commands and are intentionally gated.

export CODEX_INTEGRATION=1
export CODEX_API_KEY=your_key_here  # optional if local Codex auth session is already valid
python3 -m unittest discover -s tests/integration -p "test_*.py"

Integration test module:

  • tests/integration/test_codex_cli_integration.py
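
One common way to implement this kind of gating with unittest is a class-level skip keyed on the environment variable. The sketch below assumes that approach; the class name, the helper integration_enabled, and the prompt are illustrative and may differ from what the integration module actually does.

```python
import os
import shutil
import subprocess
import unittest


def integration_enabled(environ=os.environ) -> bool:
    """Return True only when CODEX_INTEGRATION is set to exactly "1"."""
    return environ.get("CODEX_INTEGRATION") == "1"


@unittest.skipUnless(integration_enabled(), "set CODEX_INTEGRATION=1 to run real CLI tests")
class CodexCliIntegrationSketch(unittest.TestCase):
    def setUp(self):
        # Skip rather than fail when the CLI is not installed.
        if shutil.which("codex") is None:
            self.skipTest("codex executable not found on PATH")

    def test_exec_returns_zero(self):
        # Assumption: a plain `codex exec <prompt>` call exits 0 on success.
        proc = subprocess.run(
            ["codex", "exec", "Reply with the single word: ok"],
            capture_output=True,
            text=True,
            timeout=120,
        )
        self.assertEqual(proc.returncode, 0, proc.stderr)
```

With the variable unset, the whole class is reported as skipped, so the default unit run never touches the real CLI or credentials.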

CI Guidance

  • .github/workflows/unit.yml: runs full unit suite on push and pull request.
  • .github/workflows/integration.yml: manual workflow_dispatch run, secret-gated by CODEX_API_KEY.
  • Keep unit tests hermetic (mock CLI interactions).
  • Run integration tests only in controlled environments with secure credentials.
  • Document every new public API with tests covering success and failure paths.
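
As an illustration of the hermetic guideline and of covering both success and failure paths, the following sketch mocks subprocess.run so no real CLI process is started. The function run_codex is a hypothetical stand-in written for this example, not the SDK's public API, and the result dictionary shape is likewise an assumption.

```python
import subprocess
import unittest
from unittest import mock


def run_codex(prompt: str, timeout: float = 30.0) -> dict:
    """Hypothetical stand-in for an SDK call; not the project's real API."""
    try:
        proc = subprocess.run(
            ["codex", "exec", prompt],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # Failure path: report the timeout explicitly instead of raising.
        return {"ok": False, "timed_out": True, "output": ""}
    return {"ok": proc.returncode == 0, "timed_out": False, "output": proc.stdout}


class RunCodexTest(unittest.TestCase):
    @mock.patch("subprocess.run")
    def test_success(self, fake_run):
        # The mock replaces the real process launch with a canned result.
        fake_run.return_value = subprocess.CompletedProcess(
            args=["codex", "exec", "hi"], returncode=0, stdout="done\n", stderr=""
        )
        result = run_codex("hi")
        self.assertTrue(result["ok"])
        self.assertEqual(result["output"], "done\n")
        fake_run.assert_called_once()

    @mock.patch("subprocess.run")
    def test_timeout(self, fake_run):
        # Simulate the CLI hanging past the timeout.
        fake_run.side_effect = subprocess.TimeoutExpired(cmd="codex", timeout=30.0)
        result = run_codex("hi")
        self.assertFalse(result["ok"])
        self.assertTrue(result["timed_out"])
```

Because the patch targets subprocess.run itself, the test passes on machines where codex is not installed, which is what keeps the unit workflow safe to run on every push.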